Click Here!
home account info subscribe login search My ITKnowledge FAQ/help site map contact us


 
Brief Full
 Advanced
      Search
 Search Tips
To access the contents, click the chapter and section titles.

Oracle Performance Tuning and Optimization
(Publisher: Macmillan Computer Publishing)
Author(s): Edward Whalen
ISBN: 067230886x
Publication Date: 04/01/96

Bookmark It

Search this book:
 
Previous Table of Contents Next


Unlike traditional disk drives, when you use a disk array, the data is automatically striped across all the disk drives; therefore, it is necessary to create only one tablespace and table for all your data. You do not even have to put indexes into another tablespace—although I recommend doing so for other reasons (such as monitoring and maintenance).

With traditional disk partitioning, it is difficult to manage hundreds of data files and disks; with a disk array, you can manage hundreds of disks with just a few data files. Of course, Oracle has a 2 gigabyte limitation on the size of a data file, but this is easily resolved by creating a data file for every 2 gigabytes of space you need. The data files can all reside on the same disk array volume. By splitting tablespaces into several data files with tables striped across them, all residing on the same logical volume, you can take better advantage of the Parallel Query option.

Because the data warehouse may have a large amount of data that is sparsely accessed, it is to your advantage to put many different types of archival and current data on each disk volume. By spreading out the data, the I/O load is more evenly distributed.

If you use a disk array, many of the management tasks and load balancing tasks are greatly simplified. With the disk array, you also have the option of using fault tolerance without affecting system performance. Of course, using fault tolerance requires significantly more disks.

I recommend that you use a disk array if possible. Software striping is fine, but if your system is under heavy loads (as it is with a typical data warehousing system), you can achieve better performance by offloading the striping overhead to a hardware RAID controller.

Fault Tolerance Consideration

Because the data warehouse contains so much data, you can take one of two approaches to data protection:

  Protect everything. Because there is so much data and so many disks in use, everything must be protected. The large number of disks in use increases the possibility of a disk failure. The massive amount of data increases the time needed for backup and recovery.
  Conserve cost. Because there are so many disks involved, it may be cost prohibitive to use RAID-1 or disk mirroring. When you mirror the disks, you double the number of disks.

In a data warehousing system, a good compromise is to use a fault tolerant method such as RAID-5 for the data files. You can be somewhat selective and use RAID-1 on volumes with heavy update activity and RAID-5 on volumes with more read activity. Remember that the performance penalty for RAID-5 is only on writing; you can achieve excellent read performance from RAID-5.

Hardware Considerations

When choosing hardware to use for a data warehousing system, consider these factors:

  Low user load. Not many concurrent processes/threads simultaneously access the system—unless you take advantage of the Parallel Query option.
  High I/O load. I/Os are concurrent and heavy, with mostly random I/O.
  Huge amounts of data. Data warehousing systems typically involve massive amounts of data. You must make sure that your system can support the high volumes of data you will be using.
  Low network traffic during runtime, possibly high during load. During the execution of typical decision support queries against your data warehouse, there is very little network activity. When data is being loaded or updated from other sources (possibly your OLTP systems), the network activity can be quite high.

If you can take advantage of the Oracle Parallel Query option, many different processes will use the machine at once; an SMP or MPP machine should scale very well. Because an SMP architecture uses CPUs based on the processes that are available to be run, if you always have a runnable process available for each CPU, you should see good scaling by adding additional processors. With an MPP machine, you see a similar effect but on a much larger scale.

Because there is much random access to the disks, you can benefit from a disk array. I prefer hardware striping to OS striping because hardware striping does not incur any additional overhead for the operating system and does not take up valuable CPU cycles. If hardware striping is not available, OS striping is adequate.

Network traffic may or may not be an issue to your data warehousing system. If necessary, segment the network or add faster network hardware. A network bottleneck is an easy problem to solve: simply add more and faster hardware.

Tuning Considerations

The data warehouse is tuned to allow several large processes to run at maximum throughput. There is usually no concern for response times. All the tuning tips described in Part II of this book, “Tuning the Server,” apply here. Remember that with the data warehouse, much less concern is given to user response times; throughput counts.

You may have to tune both Oracle and the server operating system. The following sections look first at Oracle and then at the server operating system.


Previous Table of Contents Next


Products |  Contact Us |  About Us |  Privacy  |  Ad Info  |  Home

Use of this site is subject to certain Terms & Conditions, Copyright © 1996-2000 EarthWeb Inc.
All rights reserved. Reproduction whole or in part in any form or medium without express written permission of EarthWeb is prohibited.